Bayesian variable order Markov models
نویسنده
چکیده
We present a simple, effective generalisation of variable order Markov models to full online Bayesian estimation. The mechanism used is close to that employed in context tree weighting. The main contribution is the addition of a prior, conditioned on context, on the Markov order. The resulting construction uses a simple recursion and can be updated efficiently. This allows the model to make predictions using more complex contexts, as more data is acquired, if necessary. In addition, our model can be alternatively seen as a mixture of tree experts. Experimental results show that the predictive model exhibits consistently good performance in a variety of domains. We consider Bayesian estimation of variable order Markov models (see Begleiter et al., 2004, for an overview). Such models create a tree of partitions, where the disjoint sets of every partition correspond to different contexts. We can associate a sub-model or expert with each context in order to make predictions. The main contribution of this paper is a conditional prior on the Markov order—or equivalently the context depth. This is based on a recursive construction that estimates, for each context at a certain depth k, whether it makes better predictions than the predictions of contexts at depths smaller than k. This simple model defines a mixture of variable order Marko models and its parameters can be updated in closed form in time O (D) for trees of depth D with each new observation. For unbounded length contexts, the complexity of the algorithm is O ( T 2 ) for an input sequence of length T . Furthermore, it exhibits robust performance in a variety of tasks. Finally, the model is easily extensible to controlled processes. Appearing in Proceedings of the 13 International Conference on Artificial Intelligence and Statistics (AISTATS) 2010, Chia Laguna Resort, Sardinia, Italy. Volume 9 of JMLR: W&CP 9. Copyright 2010 by the authors. The following section presents our setup for the sequence prediction problem and introduces notation. Section 2 defines the proposed Bayesian variable order Markov model (henceforth BVMM) and derives a closed form sequential update for the model parameters. Section 3 gives a brief overview of other tree based models, as well as a standard Markov chain mixture, and discusses their relation to BVMMs. Experimental comparisons between various models are given in Section 4, on a number of sequential prediction problems. The paper concludes with Section 5, which also discusses extensions and applications to reinforcement learning.
منابع مشابه
Comparison of Nml and Bayesian Scoring Criteria for Learning Parsimonious Markov Models
Parsimonious Markov models, a generalization of variable order Markov models, have been recently introduced for modeling biological sequences. Up to now, they have been learned by Bayesian approaches. However, there is not always sufficient prior knowledge available and a fully uninformative prior is difficult to define. In order to avoid cumbersome cross validation procedures for obtaining the...
متن کاملVOMBAT: prediction of transcription factor binding sites using variable order Bayesian trees
Variable order Markov models and variable order Bayesian trees have been proposed for the recognition of transcription factor binding sites, and it could be demonstrated that they outperform traditional models, such as position weight matrices, Markov models and Bayesian trees. We develop a web server for the recognition of DNA binding sites based on variable order Markov models and variable or...
متن کاملRecognition of cis-Regulatory Elements with Vombat
Variable order Markov models and variable order Bayesian trees have been proposed for the recognition of cis-regulatory elements, and it has been demonstrated that they outperform traditional models such as position weight matrices, Markov models, and Bayesian trees for the recognition of binding sites in prokaryotes. Here, we study to which degree variable order models can improve the recognit...
متن کاملIdentification of transcription factor binding sites with variable-order Bayesian networks
MOTIVATION We propose a new class of variable-order Bayesian network (VOBN) models for the identification of transcription factor binding sites (TFBSs). The proposed models generalize the widely used position weight matrix (PWM) models, Markov models and Bayesian network models. In contrast to these models, where for each position a fixed subset of the remaining positions is used to model depen...
متن کاملLearning Bayesian Network Structure using Markov Blanket in K2 Algorithm
A Bayesian network is a graphical model that represents a set of random variables and their causal relationship via a Directed Acyclic Graph (DAG). There are basically two methods used for learning Bayesian network: parameter-learning and structure-learning. One of the most effective structure-learning methods is K2 algorithm. Because the performance of the K2 algorithm depends on node...
متن کاملذخیره در منابع من
با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید
عنوان ژورنال:
دوره شماره
صفحات -
تاریخ انتشار 2010